Factored Adaptation for Non-Stationary Reinforcement Learning
Fan Feng, Biwei Huang, Kun Zhang, Sara Magliacane
Dealing with non-stationarity in environments (e.g., in the transition dynamics) and objectives (e.g., in the reward functions) is a challenging problem that is crucial in real-world applications of reinforcement learning (RL). While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors and causal graphs across different environments. In particular, we propose Factored Adaptation for Non-Stationary RL (FANS-RL), a factored adaptation approach that jointly learns both the causal structure, in terms of a factored MDP, and a factored representation of the individual time-varying change factors. We prove that under standard assumptions we can completely recover the causal graph representing the factored transition and reward functions, as well as a partial structure between the individual change factors and the state components. Through our general framework, we can handle general non-stationary scenarios with different function types and changing frequencies, including changes across episodes and within episodes. Experimental results demonstrate that FANS-RL outperforms existing approaches in terms of return, compactness of the latent state representation, and robustness to varying degrees of non-stationarity.
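To make the factored setup concrete, below is a minimal sketch, based only on the abstract and not on the authors' implementation, of a factored MDP in which each state component has its own causal parents and its own latent change factor, instead of one shared embedding for all changes. All names here (parents, theta, step) are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a factored non-stationary MDP in the spirit of FANS-RL
# (NOT the paper's implementation). Each state component s_i is updated from
# its own causal parents and is modulated by its own change factor theta_i.

rng = np.random.default_rng(0)

n_state = 3                            # state components s_0, s_1, s_2
# Causal graph: parents[i] lists which state components feed s_i's update.
parents = {0: [0], 1: [0, 1], 2: [1, 2]}

# Individual latent change factors: theta[i] affects only component i.
theta = np.array([1.0, 0.5, -0.3])

def step(s, a, theta):
    """One factored transition: s_i' = f_i(s[parents[i]], a, theta_i) + noise."""
    s_next = np.empty(n_state)
    for i in range(n_state):
        s_next[i] = theta[i] * s[parents[i]].sum() + 0.1 * a + rng.normal(0, 0.01)
    return s_next

# Within-episode non-stationarity: only theta_1 drifts over time, so an agent
# with a factored representation only needs to adapt that single factor,
# rather than re-estimating one entangled embedding for the whole environment.
s = np.zeros(n_state)
for t in range(5):
    theta[1] += 0.05                   # change in a single latent factor
    s = step(s, a=1.0, theta=theta)
    print(t, np.round(s, 3))
```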
Decision-making and Fuzzy Temporal Logic
This paper shows that fuzzy temporal logic can model figures of thought to describe decision-making behaviors. To exemplify this, some economic behaviors observed experimentally are modeled as problems of choice involving time, uncertainty, and fuzziness. Regarding time preference, it is noted that subadditive discounting is mandatory in positive-reward situations and consequently results in the magnitude effect and the time effect, where the latter entails stronger discounting for earlier delay periods (e.g., one hour, one day) but weaker discounting for longer delay periods (e.g., six months, one year, ten years). In addition, it is possible to explain preference reversal (a change of preference when two rewards proposed on different dates are shifted in time). Regarding Prospect Theory, it is shown that risk seeking and risk aversion are magnitude dependent, and risk seeking may disappear when the values to be lost are very high.
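The time-effect claim (stronger discounting at short delays, weaker at long ones) can be checked numerically with hyperbolic discounting, a standard subadditive discount function. The specific function D and the parameter k below are illustrative choices, not taken from the paper.

```python
# Numerical check of the "time effect" using hyperbolic discounting,
# D(t) = 1 / (1 + k*t), a standard subadditive discount function.
# The choice of D and k = 1.0 is illustrative, not from the paper.

k = 1.0

def D(t):
    return 1.0 / (1.0 + k * t)

def rate(t):
    """Fraction of value lost over the single period [t, t+1]."""
    return 1.0 - D(t + 1) / D(t)

for t in [0, 1, 6, 12, 120]:           # delays, e.g. in months
    print(f"t={t:>3}: D={D(t):.3f}, one-period loss={rate(t):.1%}")
# Early delays are discounted hard (50% loss over [0,1]) while long delays
# are discounted weakly (~7% over [12,13]), matching the time effect.

# Subadditivity: discounting over an undivided interval is weaker than the
# product of discounting over its subintervals, D(t1 + t2) > D(t1) * D(t2).
assert D(2) > D(1) * D(1)
```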